source("./Mean Reversion/RMR.001 Load Packages.R")
pricing_data <- read_csv("./Mean Reversion/Raw Data/pricing data.csv")
## Parsed with column specification:
## cols(
## date_unix = col_integer(),
## date_time = col_datetime(format = ""),
## high = col_double(),
## low = col_double(),
## open = col_double(),
## close = col_double(),
## volume = col_double(),
## quote_volume = col_double(),
## weighted_average = col_double(),
## currency_pair = col_character(),
## period = col_integer()
## )
Description
Spreads Poloniex pricing data into wide format and filters the data to a specified time resolution and time window.
Arguments
pricing_data: A dataframe containing pricing data from Poloniex gathered in tidy format.
time_resolution: The number of seconds that each observation spans. Takes values 300, 900, 1800, 7200, 14400, and 86400.
start_date: The start date of the time window.
end_date: The end date of the time window.
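As a rough illustration of the filter-then-widen step, the sketch below uses base R's reshape() as a stand-in for tidyr's spread(), so it runs without any packages. The toy data frame `long` and its prices are hypothetical; only the column names follow the parsed column specification above.

```r
# Hypothetical tidy (long) pricing data: one row per timestamp, pair and period.
stamps <- as.POSIXct(c("2017-09-01 00:00:00", "2017-09-01 00:15:00"), tz = "UTC")
long <- data.frame(
  date_time     = rep(stamps, times = 3),
  currency_pair = rep(c("USDT_BTC", "USDT_ETH", "USDT_BTC"), each = 2),
  close         = c(4700, 4710, 380, 382, 4705, 4708),
  period        = rep(c(900L, 900L, 300L), each = 2),
  stringsAsFactors = FALSE
)

# Keep a single time resolution, mirroring filter(period == time_resolution).
one_res <- subset(long, period == 900L,
                  select = c(date_time, currency_pair, close))

# Long to wide: one row per timestamp, one close column per currency pair.
# reshape() prefixes the new columns with "close."; spread() would not.
wide <- reshape(one_res, idvar = "date_time", timevar = "currency_pair",
                direction = "wide")
print(wide)
```

The wide layout is what lets later functions pull a whole price series with train[[coin_y]].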
prepare_data <- function(pricing_data, time_resolution, start_date, end_date) {
df <- pricing_data %>%
filter(period == time_resolution,
date_time >= start_date,
date_time <= end_date) %>%
select(date_unix, date_time, close, currency_pair) %>%
spread(currency_pair, close)
return(df)
}
Description
The Engle-Granger method is used to test for cointegration. This method is comprised of two steps: (1) Perform a linear regression of log(coin_y) on log(coin_x). (2) Perform an Augmented Dickey-Fuller test on the residuals from the linear regression estimated in (1). The ADF test specification is of a non-zero mean, no time-based trend, and one autoregressive lag. The function returns the ADF test statistic.
Arguments
coin_y: A vector containing the pricing data for the dependent coin in the regression.
coin_x: A vector containing the pricing data for the independent coin in the regression.
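To make the two steps concrete, here is a minimal base-R sketch on simulated data. It writes out by hand the regression that an ADF test with a drift term and one lag estimates, and reads off the t-statistic on the lagged residual level. The simulated series and all variable names are assumptions for illustration and are not part of the original analysis.

```r
set.seed(42)

# Simulated log prices (hypothetical): log_x is a random walk and log_y tracks
# it with stationary noise, so the pair is cointegrated by construction.
n <- 500
log_x <- 5 + cumsum(rnorm(n, sd = 0.01))
log_y <- 0.5 + 0.8 * log_x + rnorm(n, sd = 0.005)

# Step 1: regress log(coin_y) on log(coin_x) and keep the residuals.
e <- unname(residuals(lm(log_y ~ log_x)))

# Step 2: ADF regression with drift and one lag,
#   delta_e[t] = a + gamma * e[t-1] + phi * delta_e[t-1] + u[t]
d <- diff(e)                       # d[i] = e[i + 1] - e[i]
adf <- data.frame(
  d_e     = d[-1],                 # delta_e[t]   for t = 3..n
  e_lag   = e[2:(n - 1)],          # e[t-1]
  d_e_lag = d[-(n - 1)]            # delta_e[t-1]
)
fit <- lm(d_e ~ e_lag + d_e_lag, data = adf)

# The test statistic is the t-value on the lagged level.
adf_stat <- summary(fit)$coefficients["e_lag", "t value"]
print(adf_stat)
```

A strongly negative statistic indicates mean-reverting residuals; in this document the value is later compared against the cutoff hard-coded in select_pairs().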
test_cointegration <- function(coin_y, coin_x) {
lm_model <- lm(log(coin_y) ~ log(coin_x))
lm_residuals <- lm_model[["residuals"]]
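# ADF test on the residuals: drift term (non-zero mean), no trend, one lag
adf_test <- ur.df(lm_residuals, type = "drift", lags = 1)
# Row 2, column 3 of the coefficient table is the t-value on the lagged level
df_stat <- adf_test@testreg[["coefficients"]][2, 3]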
return(df_stat)
}
Description
Two sets of currency pairs are examined: currency pairs where USDT is the quote currency and currency pairs where BTC is the quote currency. All combinations of coins are created within a given quote currency. Combinations that consist of the coin with itself are removed. The function returns a dataframe containing the coin pairs.
Arguments
quote_currency: A string indicating the quote currency of the currency pairs. Can take values USDT or BTC.
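The pair construction can be sketched in base R with a shortened, hypothetical coin list. Ordered pairs are kept in both directions, since regressing y on x is not the same model as regressing x on y.

```r
# Shortened coin list for illustration (the function itself uses seven coins).
coin_list <- c("USDT_BTC", "USDT_ETH", "USDT_LTC")

# All ordered combinations, then drop each coin paired with itself.
coin_pairs <- expand.grid(coin_y = coin_list, coin_x = coin_list,
                          stringsAsFactors = FALSE)
coin_pairs <- coin_pairs[coin_pairs$coin_y != coin_pairs$coin_x, ]

nrow(coin_pairs)  # 3 * 2 = 6 ordered pairs
```

By the same count, a list of seven coins yields 7 * 6 = 42 candidate pairs.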
create_pairs <- function(quote_currency) {
if (quote_currency == "USDT") {
coin_list <- c("USDT_BTC", "USDT_DASH", "USDT_ETH", "USDT_LTC", "USDT_REP", "USDT_XMR", "USDT_ZEC")
}
if (quote_currency == "BTC") {
coin_list <- c("BTC_DASH", "BTC_ETH", "BTC_LTC", "BTC_REP", "BTC_XEM", "BTC_XMR", "BTC_ZEC")
}
coin_pairs <- expand.grid(coin_list, coin_list) %>%
rename(coin_y = Var1,
coin_x = Var2) %>%
filter(coin_y != coin_x) %>%
mutate_if(is.factor, as.character) %>%
as_tibble()
return(coin_pairs)
}
Description
Test for cointegration between each coin pair generated by the create_pairs() function. The test for cointegration is performed by the test_cointegration() function. The function returns a dataframe containing the coin pairs and the ADF test statistic resulting from testing cointegration between each coin pair.
Arguments
train: A dataframe generated by prepare_data() that represents the training set for the coin pairs.
coin_pairs: A dataframe generated by create_pairs().
test_pairs <- function(train, coin_pairs) {
adf_stat <- c()
for (n in 1:nrow(coin_pairs)) {
coin_y <- coin_pairs[[n, "coin_y"]]
coin_x <- coin_pairs[[n, "coin_x"]]
cointegration_results <- test_cointegration(coin_y = train[[coin_y]], coin_x = train[[coin_x]])
adf_stat <- c(adf_stat, cointegration_results)
}
df <- coin_pairs %>%
mutate(adf_stat = adf_stat) %>%
arrange(adf_stat)
return(df)
}
Description
Select cointegrated coin pairs to be used in a mean reversion strategy. The current selection logic keeps every coin pair whose ADF test statistic is less than or equal to -3.43.
Arguments
train: A dataframe generated by prepare_data() that represents the training set for the coin pair.
coin_pairs: A dataframe generated by create_pairs().
select_pairs <- function(train, coin_pairs) {
set.seed(5)
df <- test_pairs(train = train, coin_pairs = coin_pairs) %>%
filter(adf_stat <= -3.43)
return(df)
}
Description
Generate trading signals that indicate the current position in the spread formed by a linear combination of coin y and coin x. A signal of +1 indicates a long position in the spread, 0 indicates a flat position, and -1 indicates a short position in the spread. Signals are generated for the test set using a model trained on the training set.
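The current trading logic is to perform a linear regression of log(coin y) on log(coin x) using the training set. A spread is then calculated in the test set using the fitted hedge ratio and intercept from the regression. The z-score of the spread is then calculated using the mean and standard deviation of the training-set residuals. A position is entered when the z-score reaches +threshold_z or -threshold_z (2 in the runs below) and is exited when the z-score returns to 0. All conditions are evaluated on the previous period's z-score, so a signal takes effect one observation after the threshold is crossed. Losing positions are also exited when the z-score reaches +4 or -4. As coded, the cummin() and cummax() conditions keep that side of the trade flat for the rest of the test set once the +4 or -4 level has been touched, so the position is not re-entered.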
Arguments
train: A dataframe generated by prepare_data() that represents the training set for the coin pair.
test: A dataframe generated by prepare_data() that represents the test set for the coin pair.
coin_y: A string indicating the dependent coin in the coin pair regression.
coin_x: A string indicating the independent coin in the coin pair regression.
threshold_z: A number indicating the absolute value of the z-score threshold for entering a position in the spread.
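The entry and exit rules can be traced on a small hand-made z-score series. The sketch below is a simplified base-R version: it omits the stop-loss at +4 or -4, and it uses a hypothetical helper, fill_forward(), in place of zoo's na.locf(). The vector `z` is made up for illustration.

```r
# Carry the last non-NA value forward (stand-in for na.locf(x, na.rm = FALSE)).
fill_forward <- function(x) {
  idx <- cumsum(!is.na(x))           # how many non-NA values seen so far
  c(NA, x[!is.na(x)])[idx + 1]
}

threshold_z <- 2
z      <- c(-0.5, -2.1, -1.0, 0.2, 1.0, 2.3, 1.5, -0.1, 0.3)
z_prev <- c(NA, head(z, -1))         # previous period's z-score

# Long side: enter at z <= -threshold, exit once z is back at or above 0.
signal_long <- ifelse(z_prev <= -threshold_z, 1, NA)
signal_long <- ifelse(z_prev >= 0, 0, signal_long)
signal_long <- fill_forward(signal_long)

# Short side: enter at z >= +threshold, exit once z is back at or below 0.
signal_short <- ifelse(z_prev >= threshold_z, -1, NA)
signal_short <- ifelse(z_prev <= 0, 0, signal_short)
signal_short <- fill_forward(signal_short)

# Combine; periods before any rule has fired are treated as flat.
signal <- signal_long + signal_short
signal <- ifelse(is.na(signal), 0, signal)
print(signal)
```

In this toy series the z-score first drops below -2 at the second observation, so the long position starts at the third; it is closed one period after the z-score turns positive.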
generate_signals <- function(train, test, coin_y, coin_x, threshold_z) {
model <- lm(log(train[[coin_y]]) ~ log(train[[coin_x]]))
intercept <- coef(model)[1]
hedge_ratio <- coef(model)[2]
df_signals <- test %>%
mutate(spread = log(test[[coin_y]]) - log(test[[coin_x]]) * hedge_ratio - intercept,
spread_z = (spread - mean(model[["residuals"]])) / sd(model[["residuals"]]),
signal_long = ifelse(lag(spread_z, 1) <= -threshold_z, 1, NA),
signal_long = ifelse(lag(spread_z, 1) >= 0, 0, signal_long),
signal_long = ifelse(lag(spread_z, 1) <= -4, 0, signal_long),
signal_long = ifelse(lag(cummin(spread_z), 1) <= -4, 0, signal_long),
signal_long = na.locf(signal_long, na.rm = FALSE),
signal_short = ifelse(lag(spread_z, 1) >= threshold_z, -1, NA),
signal_short = ifelse(lag(spread_z, 1) <= 0, 0, signal_short),
signal_short = ifelse(lag(spread_z, 1) >= 4, 0, signal_short),
signal_short = ifelse(lag(cummax(spread_z), 1) >= 4, 0, signal_short),
signal_short = na.locf(signal_short, na.rm = FALSE),
signal = signal_long + signal_short,
signal = ifelse(is.na(signal), 0, signal))
return(df_signals[["signal"]])
}
Description
Calculate the return of a cointegration-based mean reversion trading strategy using coin y and coin x.
The current backtesting logic uses signals generated by generate_signals(). The coin_y_return and coin_x_return indicate the one period percentage return of each coin. The coin_y_position and coin_x_position indicate the market value in USD in each coin. coin_y_pnl and coin_x_pnl indicate the USD value of the profit and loss for each coin. The combined_position indicates the gross market value of the combined positions.
Arguments
train: A dataframe generated by prepare_data() that represents the training set for the coin pair.
test: A dataframe generated by prepare_data() that represents the test set for the coin pair.
coin_y: A string indicating the dependent coin in the coin pair regression.
coin_x: A string indicating the independent coin in the coin pair regression.
threshold_z: A number indicating the absolute value of the z-score threshold for entering a position in the spread.
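A one-period worked example may help with the position and P&L arithmetic. The hedge ratio and returns below are hypothetical numbers chosen for easy checking; the variable names mirror the columns described above.

```r
hedge_ratio   <- 0.8    # hypothetical slope from the training regression
signal_prev   <- 1      # long the spread during the previous period
coin_y_return <- 0.02   # one-period return of coin y
coin_x_return <- 0.01   # one-period return of coin x

# Long 1 unit of coin y, short hedge_ratio units of coin x.
coin_y_position <- signal_prev * 1 * 1
coin_x_position <- signal_prev * hedge_ratio * -1

# P&L per leg, then combined and scaled by the gross position.
combined_pnl      <- coin_y_position * coin_y_return +
                     coin_x_position * coin_x_return      # 0.02 - 0.008 = 0.012
combined_position <- abs(coin_y_position) + abs(coin_x_position)  # 1.8
combined_return   <- combined_pnl / combined_position     # 0.012 / 1.8

# With a flat signal both numerator and denominator are 0.
flat_return <- 0 / 0
is.na(flat_return)  # TRUE
```

Since 0 / 0 is NaN and is.na(NaN) is TRUE in R, the NA-replacement step in backtest_pair() appears to be what turns flat periods into a return of 0.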
backtest_pair <- function(train, test, coin_y, coin_x, threshold_z) {
model <- lm(log(train[[coin_y]]) ~ log(train[[coin_x]]))
intercept <- coef(model)[1]
hedge_ratio <- coef(model)[2]
df_backtest <- test %>%
mutate(signal = generate_signals(train = train,
test = test,
coin_y = coin_y,
coin_x = coin_x,
threshold_z = threshold_z),
coin_y_return = test[[coin_y]] / lag(test[[coin_y]], 1) - 1,
coin_x_return = test[[coin_x]] / lag(test[[coin_x]], 1) - 1,
coin_y_position = signal * 1 * 1,
coin_x_position = signal * hedge_ratio * -1,
coin_y_pnl = lag(coin_y_position, 1) * coin_y_return,
coin_x_pnl = lag(coin_x_position, 1) * coin_x_return,
combined_position = abs(coin_y_position) + abs(coin_x_position),
combined_pnl = coin_y_pnl + coin_x_pnl,
combined_return = combined_pnl / lag(combined_position, 1)) %>%
mutate_all(funs(ifelse(is.na(.), 0, .))) %>%
mutate(return_pair = cumprod(1 + combined_return))
return(df_backtest[["return_pair"]])
}
Description
Calculate the return of a cointegration-based mean reversion trading strategy using an equally weighted portfolio of cointegrated coin pairs.
Arguments
train: A dataframe generated by prepare_data() that represents the training set for the coin pair.
test: A dataframe generated by prepare_data() that represents the test set for the coin pair.
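selected_pairs: A dataframe generated by select_pairs() that represents a set of cointegrated coin pairs.
threshold_z: A number indicating the absolute value of the z-score threshold for entering a position in the spread.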
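The equal weighting amounts to averaging the pairs' cumulative return paths at each timestamp. A minimal sketch with two hypothetical pairs and made-up per-period returns:

```r
# Cumulative return paths of two hypothetical pairs (each starts at 1).
pair_a <- cumprod(1 + c(0, 0.01, -0.02))   # 1, 1.01, 0.9898
pair_b <- cumprod(1 + c(0, 0.00,  0.03))   # 1, 1.00, 1.03

# Strategy path: mean across pairs at each timestamp.
return_strategy <- rowMeans(cbind(pair_a, pair_b))
print(return_strategy)
```

Averaging the cumulative paths corresponds to splitting capital equally across pairs at the start of the test window and not rebalancing afterwards.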
backtest_strategy <- function(train, test, selected_pairs, threshold_z) {
if (nrow(selected_pairs) == 0) {
return(1)
}
df <- tibble()
for (i in 1:nrow(selected_pairs)) {
single_pair <- tibble(
return_pair = backtest_pair(train = train,
test = test,
coin_y = selected_pairs[["coin_y"]][i],
coin_x = selected_pairs[["coin_x"]][i],
threshold_z = threshold_z),
coin_y = selected_pairs[["coin_y"]][i],
coin_x = selected_pairs[["coin_x"]][i],
date_time = test[["date_time"]]
)
df <- bind_rows(df, single_pair)
}
df <- df %>%
group_by(date_time) %>%
summarise(return_strategy = mean(return_pair))
return(df[["return_strategy"]])
}
Description
Create plots of a cointegration-based mean reversion trading strategy for a single coin pair comprised of coin y and coin x. There are two plots created by this function. The first plot displays the spread transformed into a z-score with three red lines at -2, 0, and 2. A green line indicates the signal, which can take values -1, 0, and +1. The second plot displays the cumulative return of the model in blue. Two additional lines show the buy and hold return of coin y and coin x as red and green lines, respectively.
Arguments
train: A dataframe generated by prepare_data() that represents the training set for the coin pair.
test: A dataframe generated by prepare_data() that represents the test set for the coin pair.
coin_y: A string indicating the dependent coin in the coin pair regression.
coin_x: A string indicating the independent coin in the coin pair regression.
threshold_z: A number indicating the absolute value of the z-score threshold for entering a position in the spread.
plot_single <- function(train, test, coin_y, coin_x, threshold_z) {
model <- lm(log(train[[coin_y]]) ~ log(train[[coin_x]]))
intercept <- coef(model)[1]
hedge_ratio <- coef(model)[2]
df_plot <- test %>%
mutate(spread = log(test[[coin_y]]) - log(test[[coin_x]]) * hedge_ratio - intercept,
spread_z = (spread - mean(model[["residuals"]])) / sd(model[["residuals"]]),
signal = generate_signals(train = train,
test = test,
coin_y = coin_y,
coin_x = coin_x,
threshold_z = threshold_z),
return_pair = backtest_pair(train = train,
test = test,
coin_y = coin_y,
coin_x = coin_x,
threshold_z = threshold_z),
return_buyhold_y = test[[coin_y]] / test[[coin_y]][1],
return_buyhold_x = test[[coin_x]] / test[[coin_x]][1])
print(summary(model))
print(ggplot(df_plot, aes(x = date_time)) +
geom_line(aes(y = spread_z, colour = "Spread Z"), size = 1) +
geom_line(aes(y = signal, colour = "Signal"), size = 0.5) +
geom_hline(yintercept = 0, colour = "red", alpha = 0.5) +
geom_hline(yintercept = 2, colour = "red", alpha = 0.5) +
geom_hline(yintercept = -2, colour = "red", alpha = 0.5) +
scale_color_manual(name = "Series",
values = c("Spread Z" = "blue",
"Signal" = "green")) +
labs(title = "Spread vs Trading Signal",
subtitle = str_c(coin_y, " and ", coin_x),
x = "Date",
y = "Spread and Signal"))
print(ggplot(df_plot, aes(x = date_time)) +
geom_line(aes(y = return_pair, colour = "Model"), size = 1) +
geom_line(aes(y = return_buyhold_y, colour = "Coin Y"), size = 0.5, alpha = 0.4) +
geom_line(aes(y = return_buyhold_x, colour = "Coin X"), size = 0.5, alpha = 0.4) +
geom_hline(yintercept = 1, colour = "black") +
scale_color_manual(name = "Return",
values = c("Model" = "darkblue",
"Coin Y" = "darkred",
"Coin X" = "darkgreen")) +
labs(title = "Model Return vs Buy Hold Return",
subtitle = str_c(coin_y, " and ", coin_x),
x = "Date",
y = "Cumulative Return"))
}
Description
Create many plots by calling the plot_single() function multiple times. Also creates a plot showing the results of the overall strategy. Creates a train and test set surrounding a cutoff date and creates plots for the top 10 selected coin pairs ranked by their ADF statistic.
Arguments
pricing_data: A dataframe containing pricing data from Poloniex gathered in tidy format.
time_resolution: The number of seconds that each observation spans. Takes values 300, 900, 1800, 7200, 14400, and 86400.
cutoff_date: A date representing the cutoff date between the train and test sets.
train_window: A period object from the lubridate package representing the length of time the train set covers.
test_window: A period object from the lubridate package representing the length of time the test set covers.
threshold_z: A number indicating the absolute value of the z-score threshold for entering a position in the spread.
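The window arithmetic can be checked with plain Date objects. The sketch below uses integer day counts instead of lubridate periods, and assumes every candle is present and both window endpoints are kept; the variable names are illustrative.

```r
cutoff_date     <- as.Date("2017-09-01")
train_days      <- 32
test_days       <- 16
time_resolution <- 900   # seconds per observation (15 minutes)

# Window boundaries around the cutoff.
train_start <- cutoff_date - train_days   # 2017-07-31
test_end    <- cutoff_date + test_days    # 2017-09-17

# Observations in the training window, counting both endpoints.
n_train <- train_days * 86400 / time_resolution + 1   # 3072 + 1 = 3073
```

A count of 3073 training observations is consistent with the 3071 residual degrees of freedom in the regression summaries printed further down (3073 observations minus two estimated coefficients).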
plot_many <- function(pricing_data, time_resolution, cutoff_date, train_window, test_window, threshold_z) {
train <- prepare_data(pricing_data = pricing_data,
time_resolution = time_resolution,
start_date = as.Date(cutoff_date) - train_window,
end_date = as.Date(cutoff_date))
test <- prepare_data(pricing_data = pricing_data,
time_resolution = time_resolution,
start_date = as.Date(cutoff_date),
end_date = as.Date(cutoff_date) + test_window)
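# Note: quote_currency is not an argument of plot_many(); it is read from the
# global environment, so it must be defined before this function is called.
selected_pairs <- select_pairs(train = train,
coin_pairs = create_pairs(quote_currency = quote_currency))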
if (nrow(selected_pairs) == 0) {
return("No coin pairs selected.")
}
print(selected_pairs)
for (i in 1:min(10, nrow(selected_pairs))) {
plot_single(train = train,
test = test,
coin_y = selected_pairs[["coin_y"]][i],
coin_x = selected_pairs[["coin_x"]][i],
threshold_z = threshold_z)
}
test <- test %>%
mutate(return_strategy = backtest_strategy(train = train,
test = .,
selected_pairs = selected_pairs,
threshold_z = threshold_z))
ggplot(test, aes(x = date_time)) +
geom_line(aes(y = return_strategy, colour = "Strategy"), size = 1) +
geom_line(aes(y = USDT_BTC / USDT_BTC[1], colour = "USDT_BTC"), size = 0.5, alpha = 0.4) +
geom_hline(yintercept = 1, colour = "black") +
scale_color_manual(name = "Return",
values = c("Strategy" = "darkblue",
"USDT_BTC" = "darkred")) +
labs(title = "Strategy Return vs Buy Hold Return",
x = "Date",
y = "Cumulative Return")
}
quote_currency <- "USDT"
time_resolution <- 900
train_window <- days(32)
test_window <- days(16)
test_by <- "16 days"
threshold_z <- 2
plot_many(pricing_data = pricing_data,
time_resolution = time_resolution,
cutoff_date = "2017-09-01",
train_window = train_window,
test_window = test_window,
threshold_z = threshold_z)
## # A tibble: 3 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_REP USDT_ZEC -5.340746
## 2 USDT_ZEC USDT_REP -5.282765
## 3 USDT_REP USDT_ETH -3.471493
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.189890 -0.036778 -0.000273 0.030813 0.247341
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.584037 0.042463 -60.85 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.043631 0.007849 132.96 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05619 on 3071 degrees of freedom
## Multiple R-squared: 0.852, Adjusted R-squared: 0.8519
## F-statistic: 1.768e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.183978 -0.032501 -0.006733 0.033032 0.166360
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.90999 0.01881 154.7 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.81637 0.00614 133.0 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0497 on 3071 degrees of freedom
## Multiple R-squared: 0.852, Adjusted R-squared: 0.8519
## F-statistic: 1.768e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.13510 -0.04964 -0.01425 0.03677 0.28900
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.310058 0.046970 -27.89 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.770269 0.008275 93.08 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.07472 on 3071 degrees of freedom
## Multiple R-squared: 0.7383, Adjusted R-squared: 0.7382
## F-statistic: 8664 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
time_resolution = time_resolution,
cutoff_date = "2017-08-01",
train_window = train_window,
test_window = test_window,
threshold_z = threshold_z)
## # A tibble: 8 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_ETH USDT_ZEC -4.462054
## 2 USDT_ZEC USDT_REP -4.361885
## 3 USDT_REP USDT_ZEC -4.335836
## 4 USDT_ZEC USDT_ETH -4.290393
## 5 USDT_DASH USDT_XMR -4.016416
## 6 USDT_XMR USDT_DASH -4.000072
## 7 USDT_ETH USDT_REP -3.695401
## 8 USDT_REP USDT_ETH -3.464286
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.179487 -0.021586 0.003705 0.026061 0.106185
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.410750 0.018182 77.59 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.742691 0.003393 218.91 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03676 on 3071 degrees of freedom
## Multiple R-squared: 0.9398, Adjusted R-squared: 0.9398
## F-statistic: 4.792e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.237564 -0.026330 -0.000836 0.041292 0.252928
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.153572 0.020108 107.1 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.058572 0.006636 159.5 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06416 on 3071 degrees of freedom
## Multiple R-squared: 0.8923, Adjusted R-squared: 0.8923
## F-statistic: 2.545e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.237726 -0.036235 -0.005994 0.034868 0.195045
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.489552 0.028320 -52.6 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.842935 0.005284 159.5 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05726 on 3071 degrees of freedom
## Multiple R-squared: 0.8923, Adjusted R-squared: 0.8923
## F-statistic: 2.545e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.133493 -0.039498 -0.001783 0.034247 0.205442
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.46257 0.03116 -46.94 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.26537 0.00578 218.91 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04798 on 3071 degrees of freedom
## Multiple R-squared: 0.9398, Adjusted R-squared: 0.9398
## F-statistic: 4.792e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.108765 -0.034745 0.000672 0.029316 0.158594
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.909297 0.030249 63.12 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.883103 0.008161 108.21 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05058 on 3071 degrees of freedom
## Multiple R-squared: 0.7922, Adjusted R-squared: 0.7922
## F-statistic: 1.171e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.164998 -0.036084 -0.006958 0.042302 0.116351
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.94302 0.04296 -21.95 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.89708 0.00829 108.21 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05098 on 3071 degrees of freedom
## Multiple R-squared: 0.7922, Adjusted R-squared: 0.7922
## F-statistic: 1.171e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.20745 -0.02401 0.01148 0.03494 0.16101
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.017909 0.019176 157.4 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.783639 0.006329 123.8 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06119 on 3071 degrees of freedom
## Multiple R-squared: 0.8331, Adjusted R-squared: 0.8331
## F-statistic: 1.533e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.20918 -0.04682 -0.01119 0.05946 0.19466
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.703749 0.046282 -58.42 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.063160 0.008586 123.83 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.07127 on 3071 degrees of freedom
## Multiple R-squared: 0.8331, Adjusted R-squared: 0.8331
## F-statistic: 1.533e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
time_resolution = time_resolution,
cutoff_date = "2017-07-01",
train_window = train_window,
test_window = test_window,
threshold_z = threshold_z)
## # A tibble: 8 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_REP USDT_XMR -5.734962
## 2 USDT_XMR USDT_REP -5.685226
## 3 USDT_BTC USDT_REP -4.539502
## 4 USDT_REP USDT_BTC -4.485586
## 5 USDT_BTC USDT_XMR -4.190621
## 6 USDT_XMR USDT_BTC -4.063350
## 7 USDT_DASH USDT_ZEC -3.551029
## 8 USDT_DASH USDT_LTC -3.535247
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.210508 -0.031649 -0.000937 0.030277 0.169426
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.61487 0.03706 -43.58 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.28768 0.00960 134.14 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0562 on 3071 degrees of freedom
## Multiple R-squared: 0.8542, Adjusted R-squared: 0.8542
## F-statistic: 1.799e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.123985 -0.019981 0.002444 0.026076 0.180642
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.633839 0.016603 98.41 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.663368 0.004945 134.14 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04034 on 3071 degrees of freedom
## Multiple R-squared: 0.8542, Adjusted R-squared: 0.8542
## F-statistic: 1.799e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.124641 -0.021271 0.004241 0.020682 0.085012
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.264116 0.013807 453.7 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.469285 0.004113 114.1 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03355 on 3071 degrees of freedom
## Multiple R-squared: 0.8092, Adjusted R-squared: 0.8091
## F-statistic: 1.302e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.14718 -0.04983 -0.01338 0.05001 0.20053
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -10.16068 0.11844 -85.78 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.72423 0.01511 114.11 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0643 on 3071 degrees of freedom
## Multiple R-squared: 0.8092, Adjusted R-squared: 0.8091
## F-statistic: 1.302e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.089384 -0.026172 0.003299 0.022341 0.085533
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.286158 0.021008 251.6 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.661334 0.005442 121.5 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03186 on 3071 degrees of freedom
## Multiple R-squared: 0.8278, Adjusted R-squared: 0.8278
## F-statistic: 1.477e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.11391 -0.03020 0.00318 0.03407 0.14419
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.95274 0.08074 -73.72 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.25177 0.01030 121.52 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04384 on 3071 degrees of freedom
## Multiple R-squared: 0.8278, Adjusted R-squared: 0.8278
## F-statistic: 1.477e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.192122 -0.032241 0.007943 0.041034 0.196809
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.629257 0.035553 45.83 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.601409 0.006213 96.80 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06764 on 3071 degrees of freedom
## Multiple R-squared: 0.7532, Adjusted R-squared: 0.7531
## F-statistic: 9371 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.21086 -0.03903 -0.01176 0.02116 0.24807
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.196284 0.021880 146.08 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.533891 0.006227 85.74 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0739 on 3071 degrees of freedom
## Multiple R-squared: 0.7054, Adjusted R-squared: 0.7053
## F-statistic: 7352 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
time_resolution = time_resolution,
cutoff_date = "2017-06-01",
train_window = train_window,
test_window = test_window,
threshold_z = threshold_z)
## # A tibble: 14 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_REP USDT_DASH -6.302270
## 2 USDT_DASH USDT_REP -6.164980
## 3 USDT_DASH USDT_ZEC -4.838517
## 4 USDT_REP USDT_ZEC -4.765335
## 5 USDT_ZEC USDT_DASH -4.419407
## 6 USDT_DASH USDT_XMR -4.338766
## 7 USDT_REP USDT_XMR -4.334303
## 8 USDT_XMR USDT_DASH -4.299602
## 9 USDT_ZEC USDT_REP -4.132672
## 10 USDT_XMR USDT_REP -4.062863
## 11 USDT_XMR USDT_ZEC -3.974349
## 12 USDT_DASH USDT_ETH -3.746119
## 13 USDT_XMR USDT_ETH -3.539918
## 14 USDT_ZEC USDT_XMR -3.518425
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.32075 -0.04357 0.00580 0.04444 0.25546
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.093575 0.040614 -51.55 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.085528 0.008785 123.57 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06478 on 3071 degrees of freedom
## Multiple R-squared: 0.8325, Adjusted R-squared: 0.8325
## F-statistic: 1.527e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.246518 -0.040099 -0.004599 0.032748 0.293567
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.379505 0.018168 131.0 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.766951 0.006207 123.6 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05445 on 3071 degrees of freedom
## Multiple R-squared: 0.8325, Adjusted R-squared: 0.8325
## F-statistic: 1.527e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.31848 -0.04139 0.00042 0.04379 0.19210
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.981247 0.015789 188.8 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.339406 0.003259 104.1 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06251 on 3071 degrees of freedom
## Multiple R-squared: 0.7793, Adjusted R-squared: 0.7792
## F-statistic: 1.084e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.279354 -0.047233 0.001543 0.046953 0.281885
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.991651 0.019443 51.00 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.399687 0.004014 99.58 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.07698 on 3071 degrees of freedom
## Multiple R-squared: 0.7635, Adjusted R-squared: 0.7635
## F-statistic: 9916 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.47825 -0.11628 -0.02548 0.06762 0.74612
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.77880 0.10194 -56.69 <0.0000000000000002 ***
## log(train[[coin_x]]) 2.29608 0.02205 104.14 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1626 on 3071 degrees of freedom
## Multiple R-squared: 0.7793, Adjusted R-squared: 0.7792
## F-statistic: 1.084e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.24382 -0.05767 -0.01469 0.05710 0.21907
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.490431 0.023475 106.09 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.615545 0.006772 90.89 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06927 on 3071 degrees of freedom
## Multiple R-squared: 0.729, Adjusted R-squared: 0.7289
## F-statistic: 8262 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.17542 -0.06927 -0.02101 0.08183 0.36063
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.483450 0.030581 15.81 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.704709 0.008822 79.88 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09023 on 3071 degrees of freedom
## Multiple R-squared: 0.6751, Adjusted R-squared: 0.675
## F-statistic: 6381 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.26796 -0.06164 0.02596 0.07016 0.26309
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.01152 0.06024 -33.39 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.18435 0.01303 90.89 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09608 on 3071 degrees of freedom
## Multiple R-squared: 0.729, Adjusted R-squared: 0.7289
## F-statistic: 8262 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.39648 -0.11677 -0.03566 0.06922 0.58622
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.75186 0.05615 -13.39 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.91035 0.01918 99.58 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1683 on 3071 degrees of freedom
## Multiple R-squared: 0.7635, Adjusted R-squared: 0.7635
## F-statistic: 9916 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.32131 -0.04028 0.01142 0.07792 0.22094
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.66155 0.03510 18.85 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.95798 0.01199 79.88 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1052 on 3071 degrees of freedom
## Multiple R-squared: 0.6751, Adjusted R-squared: 0.675
## F-statistic: 6381 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
          time_resolution = time_resolution,
          cutoff_date = "2017-05-01",
          train_window = train_window,
          test_window = test_window,
          threshold_z = threshold_z)
## # A tibble: 23 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_LTC USDT_DASH -5.247891
## 2 USDT_LTC USDT_BTC -5.055356
## 3 USDT_REP USDT_ETH -4.910632
## 4 USDT_DASH USDT_ZEC -4.609111
## 5 USDT_DASH USDT_ETH -4.531939
## 6 USDT_LTC USDT_ETH -4.516577
## 7 USDT_XMR USDT_ZEC -4.484302
## 8 USDT_LTC USDT_ZEC -4.362703
## 9 USDT_DASH USDT_REP -4.351660
## 10 USDT_REP USDT_DASH -4.303321
## # ... with 13 more rows
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.16346 -0.05209 0.02656 0.10919 0.36281
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.18276 0.14881 -21.39 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.30622 0.03459 37.76 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2275 on 3071 degrees of freedom
## Multiple R-squared: 0.3171, Adjusted R-squared: 0.3168
## F-statistic: 1426 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.49274 -0.07323 0.00271 0.08408 0.36510
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -18.32546 0.15969 -114.8 <0.0000000000000002 ***
## log(train[[coin_x]]) 2.91569 0.02243 130.0 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.108 on 3071 degrees of freedom
## Multiple R-squared: 0.8462, Adjusted R-squared: 0.8462
## F-statistic: 1.69e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.294547 -0.034051 -0.002455 0.043462 0.299088
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.151332 0.036864 -31.23 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.925974 0.009372 98.80 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.07627 on 3071 degrees of freedom
## Multiple R-squared: 0.7607, Adjusted R-squared: 0.7606
## F-statistic: 9762 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.40903 -0.02999 0.00315 0.02929 0.21342
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.294081 0.036772 35.19 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.708576 0.008663 81.79 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06657 on 3071 degrees of freedom
## Multiple R-squared: 0.6854, Adjusted R-squared: 0.6853
## F-statistic: 6690 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.44208 -0.03115 0.01506 0.03302 0.13137
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.540921 0.028427 54.20 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.701951 0.007227 97.13 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05881 on 3071 degrees of freedom
## Multiple R-squared: 0.7544, Adjusted R-squared: 0.7543
## F-statistic: 9434 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.01640 -0.04937 0.02866 0.12096 0.38388
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.7025 0.1101 -15.46 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.0524 0.0280 37.59 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2278 on 3071 degrees of freedom
## Multiple R-squared: 0.3151, Adjusted R-squared: 0.3149
## F-statistic: 1413 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.078881 -0.020784 -0.003911 0.014606 0.092846
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.506220 0.017556 85.80 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.366222 0.004136 88.54 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03178 on 3071 degrees of freedom
## Multiple R-squared: 0.7185, Adjusted R-squared: 0.7184
## F-statistic: 7840 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.84199 -0.09483 -0.00751 0.14060 0.37788
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.69720 0.10427 -35.46 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.44527 0.02457 58.83 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1888 on 3071 degrees of freedom
## Multiple R-squared: 0.5299, Adjusted R-squared: 0.5297
## F-statistic: 3461 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.48670 -0.04195 0.01011 0.03820 0.28170
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.91496 0.02336 124.78 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.55663 0.00937 59.41 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08096 on 3071 degrees of freedom
## Multiple R-squared: 0.5347, Adjusted R-squared: 0.5346
## F-statistic: 3529 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.39145 -0.06261 0.00097 0.06899 0.46005
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.64230 0.06956 -23.61 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.96061 0.01617 59.41 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1064 on 3071 degrees of freedom
## Multiple R-squared: 0.5347, Adjusted R-squared: 0.5346
## F-statistic: 3529 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
          time_resolution = time_resolution,
          cutoff_date = "2017-04-01",
          train_window = train_window,
          test_window = test_window,
          threshold_z = threshold_z)
## # A tibble: 6 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_BTC USDT_ZEC -3.921811
## 2 USDT_ZEC USDT_BTC -3.894445
## 3 USDT_DASH USDT_XMR -3.786044
## 4 USDT_XMR USDT_DASH -3.627830
## 5 USDT_XMR USDT_ZEC -3.562814
## 6 USDT_ZEC USDT_XMR -3.508866
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.20449 -0.03945 0.00446 0.04611 0.09902
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.306614 0.014881 558.21 <0.0000000000000002 ***
## log(train[[coin_x]]) -0.327595 0.003808 -86.04 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05214 on 3071 degrees of freedom
## Multiple R-squared: 0.7068, Adjusted R-squared: 0.7067
## F-statistic: 7402 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.56203 -0.10207 0.02872 0.09563 0.36529
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.06503 0.17627 108.16 <0.0000000000000002 ***
## log(train[[coin_x]]) -2.15748 0.02508 -86.04 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1338 on 3071 degrees of freedom
## Multiple R-squared: 0.7068, Adjusted R-squared: 0.7067
## F-statistic: 7402 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.37070 -0.09691 0.01128 0.08815 0.39283
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.27203 0.03359 -8.098 0.000000000000000798 ***
## log(train[[coin_x]]) 1.58285 0.01176 134.608 < 0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1393 on 3071 degrees of freedom
## Multiple R-squared: 0.8551, Adjusted R-squared: 0.855
## F-statistic: 1.812e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.261112 -0.044455 0.003694 0.058712 0.253697
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.559809 0.017068 32.8 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.540213 0.004013 134.6 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08139 on 3071 degrees of freedom
## Multiple R-squared: 0.8551, Adjusted R-squared: 0.855
## F-statistic: 1.812e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.25227 -0.05376 -0.01961 0.03641 0.39876
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.100768 0.029653 -3.398 0.000687 ***
## log(train[[coin_x]]) 0.756217 0.007587 99.669 < 0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1039 on 3071 degrees of freedom
## Multiple R-squared: 0.7639, Adjusted R-squared: 0.7638
## F-statistic: 9934 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.41215 -0.07308 0.04379 0.08637 0.35360
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.02284 0.02895 35.33 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.01010 0.01013 99.67 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1201 on 3071 degrees of freedom
## Multiple R-squared: 0.7639, Adjusted R-squared: 0.7638
## F-statistic: 9934 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
          time_resolution = time_resolution,
          cutoff_date = "2017-03-01",
          train_window = train_window,
          test_window = test_window,
          threshold_z = threshold_z)
## # A tibble: 12 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_LTC USDT_ZEC -4.204591
## 2 USDT_XMR USDT_BTC -4.108816
## 3 USDT_XMR USDT_LTC -4.039288
## 4 USDT_LTC USDT_REP -3.989227
## 5 USDT_XMR USDT_DASH -3.973108
## 6 USDT_LTC USDT_XMR -3.948913
## 7 USDT_XMR USDT_ETH -3.862021
## 8 USDT_LTC USDT_ETH -3.852363
## 9 USDT_XMR USDT_ZEC -3.813835
## 10 USDT_LTC USDT_DASH -3.712336
## 11 USDT_XMR USDT_REP -3.697045
## 12 USDT_LTC USDT_BTC -3.596645
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.073839 -0.017634 0.003176 0.016558 0.048787
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.954515 0.011342 84.16 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.114562 0.003222 35.55 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.01962 on 3071 degrees of freedom
## Multiple R-squared: 0.2916, Adjusted R-squared: 0.2914
## F-statistic: 1264 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.107717 -0.027904 -0.005535 0.026504 0.087241
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.597869 0.058801 61.19 <0.0000000000000002 ***
## log(train[[coin_x]]) -0.153152 0.008464 -18.09 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03564 on 3071 degrees of freedom
## Multiple R-squared: 0.09634, Adjusted R-squared: 0.09604
## F-statistic: 327.4 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.10417 -0.02179 -0.00559 0.01850 0.09753
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.76882 0.03691 47.92 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.56361 0.02718 20.73 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03511 on 3071 degrees of freedom
## Multiple R-squared: 0.1228, Adjusted R-squared: 0.1225
## F-statistic: 429.9 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.091934 -0.013630 0.004628 0.015181 0.061119
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.511292 0.007737 195.34 <0.0000000000000002 ***
## log(train[[coin_x]]) -0.101738 0.005114 -19.89 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02194 on 3071 degrees of freedom
## Multiple R-squared: 0.1142, Adjusted R-squared: 0.1139
## F-statistic: 395.7 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.102551 -0.027109 -0.005734 0.027462 0.091790
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.692115 0.009553 281.80 <0.0000000000000002 ***
## log(train[[coin_x]]) -0.053538 0.003227 -16.59 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03591 on 3071 degrees of freedom
## Multiple R-squared: 0.08226, Adjusted R-squared: 0.08196
## F-statistic: 275.3 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.065242 -0.016374 -0.002281 0.017201 0.067778
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.80554 0.02663 30.25 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.21786 0.01051 20.73 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02183 on 3071 degrees of freedom
## Multiple R-squared: 0.1228, Adjusted R-squared: 0.1225
## F-statistic: 429.9 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.100185 -0.029629 -0.004341 0.026739 0.096462
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.702994 0.015659 172.6 <0.0000000000000002 ***
## log(train[[coin_x]]) -0.068144 0.006308 -10.8 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0368 on 3071 degrees of freedom
## Multiple R-squared: 0.03661, Adjusted R-squared: 0.0363
## F-statistic: 116.7 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.087069 -0.016695 0.004296 0.015469 0.061586
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.495439 0.009601 155.75 <0.0000000000000002 ***
## log(train[[coin_x]]) -0.055577 0.003868 -14.37 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02256 on 3071 degrees of freedom
## Multiple R-squared: 0.063, Adjusted R-squared: 0.0627
## F-statistic: 206.5 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.090643 -0.025600 -0.005485 0.027094 0.095963
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.262317 0.021113 107.15 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.077211 0.005998 12.87 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03652 on 3071 degrees of freedom
## Multiple R-squared: 0.0512, Adjusted R-squared: 0.05089
## F-statistic: 165.7 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.08378 -0.01672 0.00189 0.01633 0.06534
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.410630 0.006125 230.291 <0.0000000000000002 ***
## log(train[[coin_x]]) -0.017956 0.002069 -8.679 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02303 on 3071 degrees of freedom
## Multiple R-squared: 0.02394, Adjusted R-squared: 0.02362
## F-statistic: 75.32 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
          time_resolution = time_resolution,
          cutoff_date = "2017-02-01",
          train_window = train_window,
          test_window = test_window,
          threshold_z = threshold_z)
## # A tibble: 7 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_REP USDT_ETH -4.735592
## 2 USDT_REP USDT_DASH -4.488778
## 3 USDT_REP USDT_BTC -4.291817
## 4 USDT_REP USDT_XMR -4.282674
## 5 USDT_REP USDT_ZEC -4.263626
## 6 USDT_REP USDT_LTC -4.182799
## 7 USDT_LTC USDT_XMR -3.660178
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.152349 -0.034262 -0.004368 0.026838 0.206739
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.24802 0.02643 9.384 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.52599 0.01147 45.850 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05123 on 3071 degrees of freedom
## Multiple R-squared: 0.4064, Adjusted R-squared: 0.4062
## F-statistic: 2102 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.11249 -0.03994 -0.00350 0.02961 0.23109
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.614560 0.023326 26.35 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.324103 0.008944 36.24 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05565 on 3071 degrees of freedom
## Multiple R-squared: 0.2995, Adjusted R-squared: 0.2993
## F-statistic: 1313 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.187670 -0.025774 0.001334 0.040564 0.195610
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.53935 0.10271 -5.251 0.000000161 ***
## log(train[[coin_x]]) 0.29335 0.01508 19.458 < 0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06273 on 3071 degrees of freedom
## Multiple R-squared: 0.1098, Adjusted R-squared: 0.1095
## F-statistic: 378.6 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.171569 -0.027241 0.006699 0.043803 0.207157
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.04220 0.02432 42.86 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.16390 0.00955 17.16 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06351 on 3071 degrees of freedom
## Multiple R-squared: 0.08752, Adjusted R-squared: 0.08722
## F-statistic: 294.5 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.176949 -0.029198 0.006601 0.040602 0.231768
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.90290 0.05374 16.80 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.14702 0.01420 10.35 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06536 on 3071 degrees of freedom
## Multiple R-squared: 0.03372, Adjusted R-squared: 0.0334
## F-statistic: 107.2 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.185246 -0.029849 -0.003585 0.038725 0.276165
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.61473 0.02559 63.109 < 0.0000000000000002 ***
## log(train[[coin_x]]) -0.11182 0.01836 -6.091 0.00000000126 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06609 on 3071 degrees of freedom
## Multiple R-squared: 0.01194, Adjusted R-squared: 0.01161
## F-statistic: 37.1 on 1 and 3071 DF, p-value: 0.000000001262
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.108888 -0.029757 -0.009282 0.029736 0.122705
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.323961 0.015689 20.65 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.420028 0.006162 68.16 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04098 on 3071 degrees of freedom
## Multiple R-squared: 0.6021, Adjusted R-squared: 0.6019
## F-statistic: 4646 on 1 and 3071 DF, p-value: < 0.00000000000000022
plot_many(pricing_data = pricing_data,
          time_resolution = time_resolution,
          cutoff_date = "2017-01-01",
          train_window = train_window,
          test_window = test_window,
          threshold_z = threshold_z)
## # A tibble: 10 x 3
## coin_y coin_x adf_stat
## <chr> <chr> <dbl>
## 1 USDT_REP USDT_ETH -5.179556
## 2 USDT_REP USDT_XMR -4.969760
## 3 USDT_REP USDT_ZEC -4.911000
## 4 USDT_REP USDT_LTC -4.743256
## 5 USDT_REP USDT_BTC -4.470370
## 6 USDT_REP USDT_DASH -4.430406
## 7 USDT_ETH USDT_REP -3.687047
## 8 USDT_LTC USDT_XMR -3.637524
## 9 USDT_BTC USDT_XMR -3.571980
## 10 USDT_XMR USDT_BTC -3.476249
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.40720 -0.06478 -0.01449 0.05821 0.26591
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.48791 0.05090 -9.586 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.79247 0.02474 32.033 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08453 on 3071 degrees of freedom
## Multiple R-squared: 0.2505, Adjusted R-squared: 0.2502
## F-statistic: 1026 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.51019 -0.06005 0.00946 0.06368 0.24738
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.682004 0.021230 32.12 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.210199 0.009678 21.72 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09091 on 3071 degrees of freedom
## Multiple R-squared: 0.1332, Adjusted R-squared: 0.1329
## F-statistic: 471.7 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.56757 -0.02793 0.00237 0.04551 0.27726
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.112144 0.035064 3.198 0.0014 **
## log(train[[coin_x]]) 0.266615 0.009071 29.392 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08626 on 3071 degrees of freedom
## Multiple R-squared: 0.2195, Adjusted R-squared: 0.2193
## F-statistic: 863.9 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.50754 -0.05555 0.01447 0.04996 0.23690
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.60732 0.02809 21.62 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.39541 0.02075 19.06 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09233 on 3071 degrees of freedom
## Multiple R-squared: 0.1058, Adjusted R-squared: 0.1055
## F-statistic: 363.2 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.53972 -0.06862 -0.00017 0.06507 0.26323
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.39424 0.13783 2.860 0.00426 **
## log(train[[coin_x]]) 0.11149 0.02056 5.424 0.0000000629 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09718 on 3071 degrees of freedom
## Multiple R-squared: 0.009488, Adjusted R-squared: 0.009165
## F-statistic: 29.42 on 1 and 3071 DF, p-value: 0.00000006294
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.53979 -0.07294 -0.00371 0.06712 0.26543
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.95111 0.05238 18.157 < 0.0000000000000002 ***
## log(train[[coin_x]]) 0.08477 0.02328 3.641 0.000276 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09743 on 3071 degrees of freedom
## Multiple R-squared: 0.004298, Adjusted R-squared: 0.003974
## F-statistic: 13.26 on 1 and 3071 DF, p-value: 0.0002761
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.215765 -0.031801 0.007254 0.038373 0.131981
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.695565 0.011305 149.98 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.316041 0.009866 32.03 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05338 on 3071 degrees of freedom
## Multiple R-squared: 0.2505, Adjusted R-squared: 0.2502
## F-statistic: 1026 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.097006 -0.032956 -0.009391 0.028709 0.156896
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.469397 0.009839 47.71 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.403345 0.004485 89.92 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04213 on 3071 degrees of freedom
## Multiple R-squared: 0.7248, Adjusted R-squared: 0.7247
## F-statistic: 8086 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.114979 -0.018137 -0.005200 0.009541 0.093811
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.686288 0.007575 750.7 <0.0000000000000002 ***
## log(train[[coin_x]]) 0.465448 0.003453 134.8 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03244 on 3071 degrees of freedom
## Multiple R-squared: 0.8554, Adjusted R-squared: 0.8554
## F-statistic: 1.817e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
##
## Call:
## lm(formula = log(train[[coin_y]]) ~ log(train[[coin_x]]))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.181951 -0.033320 -0.000486 0.032685 0.215079
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -10.13423 0.09142 -110.9 <0.0000000000000002 ***
## log(train[[coin_x]]) 1.83783 0.01363 134.8 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06445 on 3071 degrees of freedom
## Multiple R-squared: 0.8554, Adjusted R-squared: 0.8554
## F-statistic: 1.817e+04 on 1 and 3071 DF, p-value: < 0.00000000000000022
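The last two summaries above report the same R-squared (0.8554) and the same slope t value (134.8), and the product of their slopes (0.465448 × 1.83783) is approximately 0.8554. That pattern is consistent with the two fits being the same coin pair regressed in both directions, since for a simple linear regression the product of the y-on-x and x-on-y slopes equals R-squared. The sketch below illustrates the identity on simulated data; the series `log_x` and `log_y` and all parameter values are hypothetical and are not taken from the pricing data.

```r
# Simulated log prices (hypothetical values, for illustration only)
set.seed(1)
n <- 500
log_x <- 2 + cumsum(rnorm(n, sd = 0.01))           # random-walk log price
log_y <- 0.5 + 0.4 * log_x + rnorm(n, sd = 0.02)   # linearly related log price

# Fit the regression in both directions
fit_yx <- lm(log_y ~ log_x)
fit_xy <- lm(log_x ~ log_y)

slope_yx <- unname(coef(fit_yx)[2])
slope_xy <- unname(coef(fit_xy)[2])
r2_yx <- summary(fit_yx)$r.squared
r2_xy <- summary(fit_xy)$r.squared

# Both directions share the same R-squared, and the slopes multiply to it
print(c(slope_product = slope_yx * slope_xy, r2_yx = r2_yx, r2_xy = r2_xy))
```

Although the fit quality is identical in both directions, the residuals differ, so an ADF test on the residuals can give a different statistic depending on which coin is treated as dependent.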
cutoff_dates <- seq(ymd("2017-01-01"), ymd("2017-10-01"), by = test_by)
results <- tibble()
for (cutoff_date in cutoff_dates) {
  # for() drops the Date class, so restore it with an explicit origin
  cutoff_date <- as.Date(cutoff_date, origin = "1970-01-01")
  print("Cross validating strategy.")
  print(str_c("Using train set from ", cutoff_date - train_window, " to ", cutoff_date, "."))
  print(str_c("Using test set from ", cutoff_date, " to ", cutoff_date + test_window, "."))
  # Train on the window ending at the cutoff; test on the window starting at it
  train <- prepare_data(pricing_data = pricing_data,
                        time_resolution = time_resolution,
                        start_date = cutoff_date - train_window,
                        end_date = cutoff_date)
  test <- prepare_data(pricing_data = pricing_data,
                       time_resolution = time_resolution,
                       start_date = cutoff_date,
                       end_date = cutoff_date + test_window)
  # Select pairs on the train set, then backtest them on the test set
  test <- test %>%
    mutate(return_strategy =
             backtest_strategy(train = train,
                               test = test,
                               selected_pairs = select_pairs(train = train, coin_pairs = create_pairs(quote_currency = quote_currency)),
                               threshold_z = threshold_z),
           return_strategy_change = return_strategy / lag(return_strategy, 1) - 1) %>%
    # Replace NAs (such as the first lagged row) with 0; ifelse() also strips
    # the POSIXct class from date_time, which is restored after the loop
    mutate_all(funs(ifelse(is.na(.), 0, .)))
  results <- bind_rows(results, test)
}
## [1] "Cross validating strategy."
## [1] "Using train set from 2016-11-30 to 2017-01-01."
## [1] "Using test set from 2017-01-01 to 2017-01-17."
## [1] "Cross validating strategy."
## [1] "Using train set from 2016-12-16 to 2017-01-17."
## [1] "Using test set from 2017-01-17 to 2017-02-02."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-01-01 to 2017-02-02."
## [1] "Using test set from 2017-02-02 to 2017-02-18."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-01-17 to 2017-02-18."
## [1] "Using test set from 2017-02-18 to 2017-03-06."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-02-02 to 2017-03-06."
## [1] "Using test set from 2017-03-06 to 2017-03-22."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-02-18 to 2017-03-22."
## [1] "Using test set from 2017-03-22 to 2017-04-07."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-03-06 to 2017-04-07."
## [1] "Using test set from 2017-04-07 to 2017-04-23."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-03-22 to 2017-04-23."
## [1] "Using test set from 2017-04-23 to 2017-05-09."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-04-07 to 2017-05-09."
## [1] "Using test set from 2017-05-09 to 2017-05-25."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-04-23 to 2017-05-25."
## [1] "Using test set from 2017-05-25 to 2017-06-10."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-05-09 to 2017-06-10."
## [1] "Using test set from 2017-06-10 to 2017-06-26."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-05-25 to 2017-06-26."
## [1] "Using test set from 2017-06-26 to 2017-07-12."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-06-10 to 2017-07-12."
## [1] "Using test set from 2017-07-12 to 2017-07-28."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-06-26 to 2017-07-28."
## [1] "Using test set from 2017-07-28 to 2017-08-13."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-07-12 to 2017-08-13."
## [1] "Using test set from 2017-08-13 to 2017-08-29."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-07-28 to 2017-08-29."
## [1] "Using test set from 2017-08-29 to 2017-09-14."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-08-13 to 2017-09-14."
## [1] "Using test set from 2017-09-14 to 2017-09-30."
## [1] "Cross validating strategy."
## [1] "Using train set from 2017-08-29 to 2017-09-30."
## [1] "Using test set from 2017-09-30 to 2017-10-16."
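The log above implies a rolling scheme in which each train window spans 32 days, each test window spans 16 days, and the cutoff advances by 16 days per fold. The values of `train_window`, `test_window` and `test_by` are not shown in this section, so the numbers in the sketch below are assumptions inferred from the printed dates. It rebuilds the window boundaries in base R and also shows why the loop has to convert `cutoff_date` back to a `Date`.

```r
# Assumed window lengths in days, inferred from the printed log
train_window <- 32
test_window <- 16

# Cutoffs advance by one test window; a numeric `by` means days for Date sequences
cutoff_dates <- seq(as.Date("2017-01-01"), as.Date("2017-10-01"), by = test_window)

# One row per fold: train covers [train_start, cutoff], test covers [cutoff, test_end]
windows <- data.frame(
  train_start = cutoff_dates - train_window,
  cutoff      = cutoff_dates,
  test_end    = cutoff_dates + test_window
)
print(head(windows, 3))

# for() iterates over the underlying numbers, so the Date class is lost
for (d in cutoff_dates[1]) {
  loop_class <- class(d)
}
print(loop_class)

# Iterating by index keeps the class intact
first_cutoff <- cutoff_dates[[1]]
print(class(first_cutoff))
```

Because `prepare_data` filters with `>=` and `<=`, the observation at each cutoff falls in both the train and the test window of that fold.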
results <- results %>%
  mutate(return_strategy_cumulative = cumprod(1 + return_strategy_change),
         date_time = as.POSIXct(date_time, origin = "1970-01-01"))
ggplot(results, aes(x = date_time)) +
  geom_line(aes(y = return_strategy_cumulative), colour = "blue", size = 1) +
  geom_hline(yintercept = 1, colour = "black") +
  labs(title = "Strategy Return vs Buy Hold Return", x = "Date", y = "Cumulative Return")
print(results[["return_strategy_cumulative"]][nrow(results)])
## [1] 1.119813
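The printed figure is the last element of `return_strategy_cumulative`, which compounds the per-period changes across all test windows. The sketch below shows how that chaining behaves, assuming `return_strategy` is a cumulative value series that restarts in each test window. The two toy series and the helper `period_change` are hypothetical, and the lag is done in base R rather than with dplyr's `lag`.

```r
# Hypothetical within-window value series for two consecutive test windows
fold_1 <- c(1.00, 1.02, 1.01)
fold_2 <- c(1.00, 0.99, 1.03)

# Period-over-period change, with the undefined first change set to 0
period_change <- function(equity) {
  change <- equity / c(NA, head(equity, -1)) - 1
  change[is.na(change)] <- 0
  change
}

# Stack the folds, as bind_rows() does in the loop, then compound
changes <- c(period_change(fold_1), period_change(fold_2))
cumulative <- cumprod(1 + changes)
final_return <- cumulative[length(cumulative)]
print(final_return)
```

Within a window the product telescopes to the ratio of the last value to the first, so the overall figure is the product of the per-window results (here 1.01 × 1.03).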